Search | VHL Regional Portal

Using machine learning on clinical data to identify unexpected patterns in groups of COVID-19 patients.

Cowley, Hannah Paris; Robinette, Michael S; Matelsky, Jordan K; Xenes, Daniel; Kashyap, Aparajita; Ibrahim, Nabeela F; Robinson, Matthew L; Zeger, Scott; Garibaldi, Brian T; Gray-Roncal, William.

Sci Rep ; 13(1): 2236, 2023 02 08.

Article in English | MEDLINE | ID: mdl-36755135

ABSTRACT

As clinicians are faced with a deluge of clinical data, data science can play an important role in highlighting key features driving patient outcomes, aiding in the development of new clinical hypotheses. Insight derived from machine learning can serve as a clinical support tool by connecting care providers with reliable results from big data analysis that identify previously undetected clinical patterns. In this work, we show an example of collaboration between clinicians and data scientists during the COVID-19 pandemic, identifying sub-groups of COVID-19 patients with unanticipated outcomes or who are high-risk for severe disease or death. We apply a random forest classifier model to predict adverse patient outcomes early in the disease course, and we connect our classification results to unsupervised clustering of patient features that may underpin patient risk. The paradigm for using data science for hypothesis generation and clinical decision support, as well as our triaged classification approach and unsupervised clustering methods to determine patient cohorts, are applicable to driving rapid hypothesis generation and iteration in a variety of clinical challenges, including future public health crises.

Subject(s)

COVID-19 , Humans , COVID-19/epidemiology , Pandemics , Machine Learning , Patients , Big Data

An Integrated Toolkit for Extensible and Reproducible Neuroscience.

Matelsky, Jordan K; Rodriguez, Luis M; Xenes, Daniel; Gion, Timothy; Hider, Robert; Wester, Brock A; Gray-Roncal, William.

Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 2413-2418, 2021 11.

Article in English | MEDLINE | ID: mdl-34891768

ABSTRACT

As neuroimagery datasets continue to grow in size, the complexity of data analyses can require a detailed understanding and implementation of systems computer science for storage, access, processing, and sharing. Currently, several general data standards (e.g., Zarr, HDF5, precomputed) and purpose-built ecosystems (e.g., BossDB, CloudVolume, DVID, and Knossos) exist. Each of these systems has advantages and limitations and is most appropriate for different use cases. Using datasets that don't fit into RAM in this heterogeneous environment is challenging, and significant barriers exist to leveraging underlying research investments. In this manuscript, we outline our perspective for how to approach this challenge through the use of community-provided, standardized interfaces that unify various computational backends and abstract computer science challenges from the scientist. We introduce desirable design patterns and share our reference implementation called intern.

Subject(s)

Datasets as Topic/standards , Neurosciences
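
The design pattern named in the abstract above, a standardized interface that hides the storage backend from the analysis code, can be illustrated with a minimal sketch. This is not the intern API: `VolumeBackend`, `InMemoryBackend`, `read_cutout`, and `chunked_sum` are hypothetical names chosen for illustration, and the in-memory backend only stands in for a remote or on-disk store.

```python
from abc import ABC, abstractmethod


class VolumeBackend(ABC):
    """Hypothetical common interface; each storage system would implement it."""

    @abstractmethod
    def shape(self):
        """Return the volume size as (depth, height, width)."""

    @abstractmethod
    def read_cutout(self, z_range, y_range, x_range):
        """Return the sub-volume for half-open (start, stop) ranges."""


class InMemoryBackend(VolumeBackend):
    """Toy backend over nested lists indexed as data[z][y][x]."""

    def __init__(self, data):
        self._data = data

    def shape(self):
        return (len(self._data), len(self._data[0]), len(self._data[0][0]))

    def read_cutout(self, z_range, y_range, x_range):
        return [
            [row[x_range[0]:x_range[1]] for row in plane[y_range[0]:y_range[1]]]
            for plane in self._data[z_range[0]:z_range[1]]
        ]


def chunked_sum(backend, chunk_z=1):
    """Sum all voxels while holding only chunk_z planes in memory at a time."""
    depth, height, width = backend.shape()
    total = 0
    for z0 in range(0, depth, chunk_z):
        z1 = min(z0 + chunk_z, depth)
        block = backend.read_cutout((z0, z1), (0, height), (0, width))
        total += sum(v for plane in block for row in plane for v in row)
    return total


volume = InMemoryBackend([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])
total = chunked_sum(volume, chunk_z=2)
```

Because `chunked_sum` depends only on the abstract interface, the same analysis code would work unchanged with any other backend that implements `shape` and `read_cutout`.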

DotMotif: an open-source tool for connectome subgraph isomorphism search and graph queries.

Matelsky, Jordan K; Reilly, Elizabeth P; Johnson, Erik C; Stiso, Jennifer; Bassett, Danielle S; Wester, Brock A; Gray-Roncal, William.

Sci Rep ; 11(1): 13045, 2021 06 22.

Article in English | MEDLINE | ID: mdl-34158519

ABSTRACT

Recent advances in neuroscience have enabled the exploration of brain structure at the level of individual synaptic connections. These connectomics datasets continue to grow in size and complexity; methods to search for and identify interesting graph patterns offer a promising approach to quickly reduce data dimensionality and enable discovery. These graphs are often too large to be analyzed manually, presenting significant barriers to searching for structure and testing hypotheses. We combine graph database and analysis libraries with an easy-to-use neuroscience grammar suitable for rapidly constructing queries and searching for subgraphs and patterns of interest. Our approach abstracts many of the computer science and graph theory challenges associated with nanoscale brain network analysis and allows scientists to quickly conduct research at scale. We demonstrate the utility of these tools by searching for motifs on simulated data and real public connectomics datasets, and we share simple and complex structures relevant to the neuroscience community. We contextualize our findings and provide case studies and software to motivate future neuroscience exploration.

Subject(s)

Connectome , Databases as Topic , Search Engine , Software , Animals , Caenorhabditis elegans/physiology , Drosophila melanogaster/physiology , Mice , Reproducibility of Results
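
As a rough illustration of the subgraph search problem described in the abstract above, the following sketch finds a directed three-node cycle in a toy graph by brute force. It does not use DotMotif or reproduce its algorithm or query grammar; `find_motif` and the toy edge lists are hypothetical, and the exhaustive enumeration shown here would be far too slow for real connectomes.

```python
from itertools import permutations


def find_motif(graph_edges, motif_edges):
    """Return every assignment of motif nodes to distinct graph nodes
    in which each motif edge is also an edge of the graph.

    Extra graph edges between matched nodes are allowed, so this is a
    non-induced (monomorphism) search.
    """
    graph = set(graph_edges)
    graph_nodes = sorted({node for edge in graph for node in edge})
    motif_nodes = sorted({node for edge in motif_edges for node in edge})

    matches = []
    # Try every ordered choice of distinct graph nodes for the motif nodes.
    for candidate in permutations(graph_nodes, len(motif_nodes)):
        mapping = dict(zip(motif_nodes, candidate))
        if all((mapping[u], mapping[v]) in graph for u, v in motif_edges):
            matches.append(mapping)
    return matches


# Toy directed graph: a cycle a -> b -> c -> a plus one extra edge c -> d.
toy_graph = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
# Motif: a directed triangle A -> B -> C -> A.
triangle = [("A", "B"), ("B", "C"), ("C", "A")]

matches = find_motif(toy_graph, triangle)
```

The toy graph yields three matches, one per rotation of the cycle, which shows why motif search tools typically need a way to account for motif symmetries when counting results.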

A substrate for modular, extensible data-visualization.

Matelsky, Jordan K; Downs, Joseph; Cowley, Hannah P; Wester, Brock; Gray-Roncal, William.

Big Data Anal ; 5, 2020.

Article in English | MEDLINE | ID: mdl-33880186

ABSTRACT

BACKGROUND: As the scope of scientific questions increases and datasets grow larger, the visualization of relevant information correspondingly becomes more difficult and complex. Sharing visualizations amongst collaborators and with the public can be especially onerous, as it is challenging to reconcile software dependencies, data formats, and specific user needs in an easily accessible package. RESULTS: We present substrate, a data-visualization framework designed to simplify communication and code reuse across diverse research teams. Our platform provides a simple, powerful, browser-based interface for scientists to rapidly build effective three-dimensional scenes and visualizations. We aim to reduce the limitations of existing systems, which commonly prescribe a limited set of high-level components that are rarely optimized for arbitrarily large data visualization or for custom data types. CONCLUSIONS: To further engage the broader scientific community and enable seamless integration with existing scientific workflows, we also present pytri, a Python library that bridges the use of substrate with the ubiquitous scientific computing platform, Jupyter. Our intention is to lower the activation energy required to transition between exploratory data analysis, data visualization, and publication-quality interactive scenes.
